A Bijective String Sorting Transform

Authors

  • Joseph Gil
  • David Allen Scott
Abstract

Given a string of characters, the Burrows-Wheeler Transform rearranges its characters to produce another string of the same length that is more amenable to compression techniques such as move-to-front, run-length encoding, and entropy encoders. We present a variant of the transform which yields similar or better compression but which, unlike the original, is bijective, in that the inverse transformation exists for all strings. Our experiments indicate that using our variant gives a better compression ratio than the original Burrows-Wheeler transform. We also show that both the transform and its inverse can be computed in linear time using linear storage.
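
The abstract does not spell out the construction, so the following is only a minimal sketch of one commonly described way to obtain a bijective variant: factor the input into Lyndon words, sort the rotations of all factors by their infinite periodic repetition, and emit the last character of each rotation. The function names (`lyndon_factors`, `bwts`, `inverse_bwts`) are hypothetical, and the sorting here is naive rather than the linear-time method the abstract claims.

```python
def lyndon_factors(s):
    """Split s into non-increasing Lyndon words (Duval's algorithm)."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def bwts(s):
    """Naive bijective transform sketch (assumed construction, not linear time)."""
    n = len(s)
    rotations = []
    for w in lyndon_factors(s):
        for r in range(len(w)):
            rotations.append(w[r:] + w[:r])

    # Two infinite repetitions u^w and v^w that differ must differ within
    # len(u) + len(v) <= 2n characters, so a prefix of length 2n is a valid key.
    def key(rot):
        return (rot * (2 * n // len(rot) + 1))[:2 * n]

    rotations.sort(key=key)
    return "".join(rot[-1] for rot in rotations)


def inverse_bwts(t):
    """Invert bwts by following cycles of the stable character matching."""
    n = len(t)
    # match[i] is the position in t of the i-th smallest character (stable).
    match = sorted(range(n), key=lambda i: t[i])
    visited = [False] * n
    words = []
    for i in range(n):
        if visited[i]:
            continue
        word = []
        j = i
        while not visited[j]:
            visited[j] = True
            j = match[j]
            word.append(t[j])
        words.append("".join(word))
    # Cycles come out smallest word first; the factorization is non-increasing.
    return "".join(reversed(words))
```

With these definitions, `bwts("banana")` yields `"annbaa"` and `inverse_bwts("annbaa")` recovers `"banana"`. No end-of-string marker or index is needed, which is what makes the mapping a bijection on strings of a given length.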

Similar resources

On Bijective Variants of the Burrows-Wheeler Transform

The sort transform (ST) is a modification of the Burrows-Wheeler transform (BWT). Both transformations map an arbitrary word of length n to a pair consisting of a word of length n and an index between 1 and n. The BWT sorts all rotation conjugates of the input word, whereas the ST of order k only uses the first k letters for sorting all such conjugates. If two conjugates start with the same pre...

Extension and Faster Implementation of the GRP Transform for Lossless Compression

The GRP transform, or the generalized radix permutation transform, was proposed as a parametric generalization of the BWT of the block-sorting data compression algorithm. This paper develops its extension that can be applied with any combination of parameters. By using the technique developed for linear time/space implementation of the sort transform, we propose an efficient implementation for t...

Simple Algorithm for Sorting the Fibonacci String Rotations

In this paper we focus on the combinatorial properties of Fibonacci string rotations. We first present a simple formula that, in constant time, determines the rank of any rotation (of a given Fibonacci string) in the lexicographically sorted list of all rotations. We then use this information to deduce, also in constant time, the character that is stored at any one location of any given Fi...

A Text Transformation Scheme for Degenerate Strings

The Burrows-Wheeler Transformation computes a permutation of a string of letters over an alphabet, and is well-suited to compression-related applications due to its invertibility and data clustering properties. For space efficiency the input to the transform can be preprocessed into Lyndon factors. We consider scenarios with uncertainty regarding the data: a position in an indeterminate or degene...

String Comparison in V-Order: New Lexicographic Properties & On-line Applications

V-order is a global order on strings related to Unique Maximal Factorization Families (UMFFs) [6,7], which are themselves generalizations of Lyndon words [14]. V-order has recently been proposed as an alternative to lexicographical order in the computation of suffix arrays and in the suffix-sorting induced by the Burrows-Wheeler transform. Efficient V-ordering of strings thus becomes a matte...

Journal title:
  • CoRR

Volume abs/1201.3077  Issue 

Pages  -

Publication date 2009